A comparative analysis of algorithms for somatic SNV detection in cancer
Abstract
MOTIVATION: With the advent of relatively affordable high-throughput technologies, DNA sequencing of cancers is now common practice in cancer research projects and will be increasingly used in clinical practice to inform diagnosis and treatment. Somatic (cancer-only) single nucleotide variants (SNVs) are the simplest class of mutation, yet their identification in DNA sequencing data is confounded by germline polymorphisms, tumour heterogeneity and sequencing and analysis errors. Four recently published algorithms for the detection of somatic SNV sites in matched cancer-normal sequencing datasets are VarScan, SomaticSniper, JointSNVMix and Strelka. In this analysis, we apply these four SNV calling algorithms to cancer-normal Illumina exome sequencing of a chronic myeloid leukaemia (CML) patient. The candidate SNV sites returned by each algorithm are filtered to remove likely false positives, then characterized and compared to investigate the strengths and weaknesses of each SNV calling algorithm.
RESULTS: Comparing the candidate SNV sets returned by VarScan, SomaticSniper, JointSNVMix2 and Strelka revealed substantial differences with respect to the number and character of sites returned; the somatic probability scores assigned to the same sites; their susceptibility to various sources of noise; and their sensitivities to low-allelic-fraction candidates.
AVAILABILITY: Data accession number SRA081939; code at http://code.google.com/p/snv-caller-review/
CONTACT: [email protected]
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
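The comparison of candidate SNV sets described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `compare_call_sets`, the representation of a site as a `(chrom, pos, ref, alt)` tuple, and the toy sites below are all assumptions made for illustration. It only shows one simple way to count how many callers support each site and how much two call sets overlap.

```python
from itertools import combinations


def compare_call_sets(calls):
    """Compare candidate SNV sets from several callers.

    calls: dict mapping a caller name to a set of (chrom, pos, ref, alt) tuples.
    Returns (support, unique, jaccard, consensus).
    """
    all_sites = set().union(*calls.values())
    # Number of callers that report each site.
    support = {site: sum(site in s for s in calls.values()) for site in all_sites}
    # Sites reported by exactly one caller, grouped by that caller.
    unique = {
        name: {site for site in sites if support[site] == 1}
        for name, sites in calls.items()
    }
    # Pairwise Jaccard index: |A & B| / |A | B|.
    jaccard = {}
    for a, b in combinations(sorted(calls), 2):
        union = calls[a] | calls[b]
        jaccard[(a, b)] = len(calls[a] & calls[b]) / len(union) if union else 0.0
    # Sites reported by every caller.
    consensus = {site for site, n in support.items() if n == len(calls)}
    return support, unique, jaccard, consensus


# Toy, made-up sites (not data from the study).
toy_calls = {
    "VarScan": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "SomaticSniper": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "T")},
    "JointSNVMix2": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "Strelka": {("chr1", 100, "A", "T")},
}
support, unique, jaccard, consensus = compare_call_sets(toy_calls)
```

In practice the sets would be built from each caller's filtered output; the sketch ignores the somatic probability scores and allelic fractions that the abstract also compares.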
Similar resources
Mutation Discovery in Regions of Segmental Cancer Genome Amplifications with CoNAn-SNV: A Mixture Model for Next Generation Sequencing of Tumors
Next generation sequencing has now enabled a cost-effective enumeration of the full mutational complement of a tumor genome, in particular single nucleotide variants (SNVs). Most current computational and statistical models for analyzing next generation sequencing data, however, do not account for cancer-specific biological properties, including somatic segmental copy number alterations (CNAs)-w...
FaSD-somatic: a fast and accurate somatic SNV detection algorithm for cancer genome sequencing data
Recent advances in high-throughput sequencing technologies have enabled us to sequence large number of cancer samples to reveal novel insights into oncogenetic mechanisms. However, the presence of intratumoral heterogeneity, normal cell contamination and insufficient sequencing depth, together pose a challenge for detecting somatic mutations. Here we propose a fast and an accurate so...
Comparative Analysis of Machine Learning Algorithms with Optimization Purposes
The fields of optimization and machine learning increasingly interact, and optimization in different problems leads to the use of machine learning approaches. Machine learning algorithms work in reasonable computational time for specific classes of problems and have an important role in extracting knowledge from large amounts of data. In this paper, a methodology has been employed to opt...
Detection of Somatic Mutation in Exon 12 of DNA Polymerase β in Ovarian Cancer Tissue Samples
Background: DNA polymerase β (pol β) is a key enzyme of base excision repair pathway. It is a 1-kb gene consisting of 14 exons. Its catalytic part lies between exon 8 and exon 14. Exon 12 has a role in deoxyribonucleotide triphosphate selection for nucleotide transferase activity. Methods: Genomic DNA was isolated from ovarian carcinoma samples. Single strand conformation polymorphism...
Applying Two Computational Classification Methods to Predict the Risk of Breast Cancer: A Comparative Study
Introduction: Lack of a proper method for early detection and diagnostic errors in medicine are some fundamental problems in treating cancer. Data analysis techniques may significantly help early diagnosis. The current study aimed at applying and evaluating neural networks and decision tree algorithm on breast cancer patients’ data for early cancer prediction. Methods: In the current stu...
Journal title:
Volume 29, issue
Pages -
Publication date: 2013